ABSTRACT

The DHH superfamily includes RecJ, nanoRNases (NrnA),
cyclic nucleotide phosphodiesterases and pyrophosphatases.
In this study, we have carried out in vitro and in vivo
investigations on the bifunctional NrnA-homolog from
Mycobacterium smegmatis, MSMEG_2630. The crystal structure
of MSMEG_2630 was determined to 2.2-Å resolution and reveals
a dimer consisting of two identical subunits, with each
subunit folding into an N-terminal DHH domain and a
C-terminal DHHA1 domain. The overall structure and fold of
the individual domains are similar to those of other members
of the DHH superfamily. However, MSMEG_2630 exhibits a
distinct quaternary structure in contrast to other DHH
phosphodiesterases. This novel mode of subunit packing and
variations in the linker region that enlarge the domain
interface are responsible for alternate recognition of
substrates in the bifunctional nanoRNases. MSMEG_2630
exhibits both 3′-5′ exonuclease activity [on both
deoxyribonucleic acid (DNA) and ribonucleic acid (RNA)
substrates] and CysQ-like phosphatase activity (on pAp)
in vitro, with a preference for nanoRNA substrates over
single-stranded DNA of equivalent length. A transposon
disruption of MSMEG_2630 in M. smegmatis causes growth
impairment in the presence of various DNA-damaging agents.
Further phylogenetic analysis and examination of genome
organization reveal clustering of bacterial nanoRNases into
two distinct subfamilies with a possible role in
transcriptional and translational events during stress.

INTRODUCTION 

NanoRNA (2-5 nucleotides) is generated in several cellu- 
lar processes such as messenger ribonucleic acid (mRNA) 
degradation, abortive transcription initiation, RNA cleav- 
age during transcription elongation and cyclic nucleotide 
degradation (1,2). NanoRNA has the ability to affect the 
physiological state of the cell and must be degraded completely.
Accumulation of nanoRNA leads to altered priming
of transcription initiation and hence to changes in expression
levels of various genes that could be lethal for the
normal functioning of the cell (3). These small nanoRNAs
are processed by specialized enzymes in bacterial cells. In
Escherichia coli, the degradation of nanoRNA is carried 
out by a highly conserved enzyme oligoribonuclease, ORN, 
deletion of which is detrimental to growth (4). Despite
its important physiological role, ORN is not present in all 
bacteria. ORN is replaced by a redundant nanoRNase fam- 
ily, first identified in Bacillus subtilis, homologs of which are 
also found in several other bacterial groups (5). 

Several classes of nanoRNases, NrnA and NrnB in B. 
subtilis and NrnC in Bartonella birtlesii, have been identi- 
fied (5-7) with distinct sequence features. Apart from con- 
served sequence motifs necessary for activity, the sequences 
of nanoRNase families vary significantly. Consequently, se- 
quence similarity of NrnC with NrnB and NrnA is only 
16% and 11%, respectively (7). While NrnAs are bifunc- 
tional enzymes with nanoRNase and CysQ activity (313 
residues in B. subtilis NrnA), only a nanoRNase activity 
has been identified in NrnB (399 residues in B. subtilis) 
and NrnC (181 residues in B. birtlesii). NrnA and NrnB
belong to the DHH superfamily of phosphodiesterases, which
also includes RecJ-like exonucleases, inorganic exopolyphosphatases
and family II pyrophosphatases, whereas NrnC belongs
to the DEDD family of exonucleases (7). The phylogenetic origin
of the multiple classes of nanoRNases and their physiological
roles remain unclear. The recent crystal structure of an
NrnA-homolog from Bacteroides fragilis (8) at 2.95-Å resolution
in ligand-bound form (with GMP) shows interactions of the
ligand with several residues in the C-terminal domain (CTD).
However, the molecular basis of selective recognition of 
nanoRNA by nanoRNases over pyrophosphates is still not 
clear. The crystal structure of ORN from Xanthomonas 
campestris (9) is also available but exhibits no structural 
similarity with B. fragilis NrnA and the two proteins con- 
tain completely different structural folds. This large amount 
of variation in sequence as well as in structure among var- 
ious ORN and nanoRNase families suggests independent 
evolutionary origin of these enzymes despite related func- 
tions. 

NrnA is a member of the DHH phosphodiesterase superfamily
and is widely distributed across several bacterial groups that
do not harbor orn homologs (5,10). Actinobacteria is the 
only bacterial group identified so far, where both ORN 
and NrnA are present (5). The presence of both ORN and
a nanoRNase with similar functions suggests that the
RNA processing pathway in Actinobacteria may differ substantially
from that of other bacteria and needs to be
studied in detail. Recently, Rv2837c of Mycobacterium tuberculosis,
a 336-residue protein, has been reported to be the
bifunctional NrnA counterpart that can complement E. coli
orn⁻ or cysQ⁻ strains (10).

Here, we identify MSMEG_2630 of Mycobacterium
smegmatis as an ortholog of Rv2837c and investigate the
role of this nanoRNase through a combination of structural,
in vivo and in vitro biochemical studies. Like previously
characterized NrnA members, MSMEG_2630 exhibits
3′-5′ exoribonuclease activity on RNA substrates
as well as CysQ-like pAp phosphatase activity in vitro.
In addition, we find that MSMEG_2630 also degrades
single-stranded deoxyribonucleic acid (ssDNA) in the 3′-5′
direction in vitro, although nanoRNAs appear to be the
preferred substrates. A transposon mutant of MSMEG_2630 in
M. smegmatis is viable but shows growth impairment
in the presence of various DNA-damaging agents. We have
also determined the crystal structure of this mycobacterial
nanoRNase to 2.2-Å resolution and investigated its structural
features in atomic detail. The structure of M. smegmatis
nanoRNase reveals a dimer consisting of two identical
subunits. However, the subunit packing is completely
different from that of other DHH phosphodiesterase structures,
suggesting that mycobacteria and other microbes harboring
nanoRNases utilize alternate packing of the same
well-characterized domains to meet specific functional
requirements in RNA processing (ORN or nanoribonuclease),
DNA repair (RecJ exonucleases) or phosphatase
(pyrophosphatases) activity. Further sequence analysis and
differences in the genomic location of MSMEG_2630 reveal that
nanoRNases of different bacteria fall into two functionally
distinct subfamilies. Our findings indicate that M.
smegmatis MSMEG_2630 belongs to a unique group of
nanoRNases with a possible role in transcriptional and
translational events during stress.

MATERIALS AND METHODS 

Cloning, expression and purification of MSMEG_2630 

The coding region of MSMEG_2630 was polymerase chain
reaction (PCR) amplified from genomic DNA of M. smegmatis
mc²155 using forward and reverse primers (Supplementary
Table S1). The forward primer introduced an NcoI
site at the start codon at the 5′ end, whereas the reverse
primer introduced a HindIII site at the 3′ end. The PCR
products were digested with NcoI and HindIII restriction
enzymes and cloned in a similarly digested pET-28a
(Novagen) vector to give the expression plasmid pRPS25.
pRPS25 encodes full-length MSMEG_2630 fused to a
C-terminal 6×His-tag separated by a linker peptide. pRPS25
was transformed into E. coli BL21(DE3) for expression.
The transformed cells were grown at 37°C to A600 ~ 0.5 in
Luria Bertani (LB) medium and the culture was induced
for protein expression with 0.1-mM IPTG (isopropyl
β-D-1-thiogalactopyranoside) for 12 h with constant shaking at
18°C. Subsequently, the cells were harvested at 8000 g for 10
min at 4°C. The pellet was resuspended in sonication buffer
(50-mM Tris-HCl, pH 8.0) and lysed by sonication. The cell
debris was removed by centrifugation and the His-tagged
protein was purified over nickel-nitrilotriacetate (Ni-NTA)
agarose beads (Invitrogen) as per the manufacturer's protocol.
The protein was eluted with 250-mM imidazole (in sodium
phosphate buffer, pH 6.0, and 500-mM NaCl). Fractions
containing significant amounts of purified protein,
as examined by sodium dodecyl sulfate-polyacrylamide gel
electrophoresis (SDS-PAGE) (10%), were pooled. Recombinant
His-tagged MSMEG_2630 thus purified contains the
full-length MSMEG_2630 polypeptide with 13 additional
residues at the C-terminus comprising the 6×His-tag and
the linker peptide and was designated 'rMSMEG_2630-A'.
Buffer exchange was carried out by gel filtration and
the protein was finally exchanged into 20-mM Tris-HCl,
pH 7.5. Protein purity was assessed on 10% SDS-PAGE
and concentration was estimated using the bicinchoninic acid
assay (Sigma). All activity assays described below were carried
out using rMSMEG_2630-A, unless indicated otherwise.

Construct for crystallization experiments 

The coding region of MSMEG_2630 was PCR amplified
from pRPS25 using forward and reverse primers
(Supplementary Table S1) and cloned in the pET-28-His10-Smt3
vector at BamHI and XhoI sites to give expression plasmid pRPS26.
pRPS26 encodes full-length MSMEG_2630 fused to an
N-terminal His10-Smt3 tag. The recombinant construct was
transformed into E. coli BL21(DE3) and the tagged protein
was purified as described previously (11). The eluted protein
was incubated overnight with the Smt3-specific protease Ulp1
(Ulp1:protein ratio of 1:500) at 4°C to cleave the
His10-Smt3 tag. Tag-free protein was recovered in the
flow-through by passage over a Ni-NTA agarose column (Qiagen).
Removal of the Smt3 tag left one serine residue at the N-terminus
of the purified protein. The recombinant protein thus purified
was dialyzed against low salt buffer (20-mM Tris-HCl,
pH 8.5, 25-mM NaCl, 5% glycerol, 2-mM MgCl2). Further
downstream purification over an anion exchange column
(MonoQ, GE Healthcare) followed by a size-exclusion step
over a Superdex 75 column (GE Healthcare) was necessary
to remove other contaminants and obtain purified protein,
designated 'rMSMEG_2630-B', for reproducible crystallization.
The purity of rMSMEG_2630-B was checked on
12% SDS-PAGE and the protein was concentrated to 30
mg/ml using Amicon Ultra 3-kDa molecular weight cutoff
filter units and stored at 4°C until further use.

Constructs expressing proteins corresponding to residues
1-188, 1-211 and 212-340 of MSMEG_2630 fused to an
N-terminal His10-Smt3 tag were similarly prepared and
designated MS-NTD-A, MS-NTD-B and MS-CTD, respectively.
The proteins were purified as for the full-length native
protein described above. Activity assays with these
constructs were carried out for the respective proteins in the
presence of the in-frame fused tag as indicated. Full-length
MSMEG_2630 purified along with the in-frame fused tag
(and designated Ntag_MSMEG_2630) was used as a control
in the activity assays to confirm that the tag did not affect
activity.

Expression and purification of selenomethionine derivatized 
MSMEG_2630

pRPS26 was transformed into E. coli BL21(DE3) and 
grown on selenomethionine (Se-Met) minimal media 
(Molecular Dimensions) as described by the manufacturer's 
protocol. Se-Met derivatized rMSMEG_2630-B was purified
as for the native protein, except that 2-mM TCEP was added to
all buffers and the protein was concentrated to 25 mg/ml before
crystallization.

Protein crystallization, data collection and structure refine- 
ment 

Native rMSMEG_2630-B (30 mg/ml) (in 20-mM Tris-HCl,
pH 8.5, 25-mM NaCl, 5% glycerol, 2-mM MgCl2) was crystallized
by sitting drop vapor diffusion against crystallization
buffer (100-mM HEPES, pH 7.0, 200-mM LiCl and
26% PEG 6000) at 24°C as well as 10°C. Needle-shaped
crystals were obtained in 4-5 days at both temperatures.
Crystal size and diffraction quality were further improved
by microseeding. For microseeding, crystals were
pooled from eight to ten crystallization drops and washed
multiple times with the crystallization buffer. Protein crystals
were then crushed with the help of crystallization
microtools (Hampton Research), resuspended in 50 µl of
crystallization buffer and used as seed stock. Serial dilutions
of the seed stock (1:100, 1:200, 1:400 and 1:1000) were
prepared and used for crystallization. Large rod-shaped crystals
were obtained after incubation for another 5-6 days
with the seed dilutions of 1:400 and 1:1000. Se-Met derivatized
rMSMEG_2630-B was also crystallized by microseeding
using the same native seeds. The crystals
were cryoprotected in a solution containing 15% glycerol
in addition to the crystallization buffer and flash frozen
in liquid nitrogen until data collection.



A Se-Met-incorporated crystal was subjected to a fluorescence
scan at the Se K edge on beamline BM14 under attenuation
at the European Synchrotron Radiation Facility,
Grenoble, and the data were analyzed with CHOOCH (12).
Diffraction data were collected on the same crystal at the
Se-peak wavelength of 0.9795 Å. The data were processed using
MOSFLM, data quality was assessed (14) and the data were scaled
using SCALA within the CCP4 package (15,16). The crystals
belong to the space group P1 2₁ 1 and contain two molecules
in the crystallographic asymmetric unit. Details of data
collection and data processing are summarized in Table 1.

X-ray structure determination 

The structure of rMSMEG_2630-B was solved by a single- 
wavelength anomalous dispersion (SAD) phasing method 
with data collected at the peak energy of X-ray absorption 
spectrum of selenium, using the automated structure deter- 
mination platform of AutoRickshaw (17). Further model 
building on the solution obtained was carried out in Coot 
(18) and refined in Refmac5 (19) and Buster 2.10.0 (20). 
Non-crystallographic symmetry restraints were released in 
order to capture domain movements in the different sub- 
units and translation/libration/screw (TLS) refinement (21) 
was performed during final stages with three TLS groups 
(residues 22-189, 190-240 and 241-340) for each subunit. 
Details of model building and refinement statistics are given 
in Table 1. The final coordinates have been deposited in the
PDB with the accession code 4LS9.

N-terminal sequencing 

N-terminal sequencing of rMSMEG_2630-B was carried 
out on a Procise 491 A protein sequencer (Applied Biosys- 
tems) by the Edman degradation method. The crystals were 
pooled from three to four crystallization drops and washed 
thoroughly with the crystallization buffer to remove any 
residual uncrystallized protein in the drop. The crystals were 
then crushed with the help of crystallization microtools and 
the resultant protein was resuspended in 30 µl of solubilization
buffer (20-mM Tris-HCl, pH 8.5, 25-mM NaCl, 2-mM
MgCl2 and 5% glycerol). The protein was then applied on 
a 15% SDS-PAGE and transferred to polyvinylidene diflu- 
oride (PVDF) membrane for Edman sequencing. 

Transposon mutagenesis and complementation 

A transposon mutant, L154, having a disruption in
MSMEG_2630 in M. smegmatis mc²155, was isolated for
genetic studies from an existing mini-mariner
transposon mutant library (22). The transposon mutant
was confirmed by Southern blotting and the location of
the transposon insertion was identified by sequencing the
transposon flanking region.

MSMEG_2630 or rv2837c was PCR amplified from genomic
DNA of M. smegmatis mc²155 or M. tuberculosis
H37Rv, respectively, using the respective forward and reverse
primers (Supplementary Table S1). The PCR products
were digested with PstI and HindIII restriction enzymes
and cloned in the E. coli-Mycobacterium shuttle vector pSMT3
(23) to create pRPS16 and pRPS31, respectively. L154 cells
Table 1. Data collection and model refinement statistics

Crystal and data collection
  Space group                          P1 2₁ 1
  Cell dimensions                      a = 65.87 Å, b = 86.10 Å, c = 66.03 Å, α = 90°, β = 118.37°, γ = 90°
  Resolution (Å)ᵃ                      34.59-2.20
  Number of unique reflections         32833 (4744)
  Completeness (%)                     99.5 (99.2)
  Multiplicity                         [illegible]
  Rmeas (%)ᵇ                           11.1 (40.5)
  <<I>/σ(<I>)>ᶜ                        12.9 (5.3)

Refinement
  Resolution (Å)                       [illegible]
  Number of reflections working/test   [illegible]
  Rwork (%)ᵈ                           0.176 (0.183)
  Rfree (%)ᵉ                           0.225 (0.243)
  Total protein residues/atomsᶠ        638/4779
  Total water molecules                [illegible]
  Wilson B factor (Å²)                 31.7
  Average B factor (Å²)
    Protein atoms (chain)              39.0
                                       39.3
  rms deviations from ideal
    Bond lengths (Å)                   0.01
    Bond angles (°)                    1.1
  Ramachandran plot
    Most favored regions (%)           97.9
    Disallowed regions (%)             0.0

ᵃ Numbers in parentheses correspond to the highest resolution shell.
ᵇ Rmeas as described in (14).
ᶜ <<I>/σ(<I>)> = mean I_h over the standard deviation of the mean I_h, averaged over all reflections in a resolution shell.
ᵈ $R_{\mathrm{work}} = \sum \bigl| |F_o| - |F_c| \bigr| / \sum |F_o|$, where $|F_o|$ is the observed structure factor amplitude and $|F_c|$ is the calculated structure factor amplitude.
ᵉ Rfree: R factor based on 5% of the data excluded from refinement.
ᶠ Total number of protein atoms, including those in alternate conformations.



were transformed with pRPS16 or pRPS31 by electroporation.
The resultant transformants were used for the
complementation and growth experiments described below.



Complementation and growth of M. smegmatis mutant L154

The effect of disruption of MSMEG_2630 on growth in the M.
smegmatis mutant L154 was monitored by OD600. Briefly, 1:500
dilutions of freshly growing log-phase M. smegmatis mc²155
(WT)/pSMT3, L154/pSMT3 and L154/pRPS16 cells
were inoculated separately into 50-ml LB medium at 37°C.
Aliquots of cultures were withdrawn at intervals of 12
h and growth was monitored by OD600. To study the effect of
oxidative stress and DNA-damaging agents, the minimum
inhibitory concentration (MIC) was estimated for various
extrinsic agents. For MIC estimation, M. smegmatis mc²155
(WT)/pSMT3, L154/pSMT3 and L154/pRPS16 cells were
grown in Middlebrook 7H9 liquid medium (supplemented
with 0.05% Tween-80 and 0.2% glycerol) in the presence of
increasing concentrations of H2O2, menadione, mitomycin
C, ciprofloxacin, nalidixic acid, zeocin, hydroxyurea, ethyl
methanesulfonate, methyl methanesulfonate and ethidium
bromide (EtBr). Cell growth was monitored after 48 h
by OD600. The MIC for the DNA intercalating agent EtBr was also
estimated as above on L154/pRPS31.
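
The MIC read-out described above can be sketched as follows. This is a minimal illustration only: it assumes that the MIC is taken as the lowest tested concentration at which the 48-h OD600 falls to or below a chosen fraction of the untreated culture. The 10% cutoff, the function name and the example readings are hypothetical and are not taken from this study.

```python
def estimate_mic(od_by_concentration, od_untreated, cutoff_fraction=0.1):
    """Return the lowest tested concentration whose OD600 is at or below
    cutoff_fraction * od_untreated, or None if no tested concentration
    inhibits growth to that level.

    od_by_concentration: dict mapping agent concentration -> OD600 at 48 h.
    cutoff_fraction: assumed definition of 'no growth' (hypothetical 10%).
    """
    if od_untreated <= 0:
        raise ValueError("untreated OD600 must be positive")
    threshold = cutoff_fraction * od_untreated
    # Scan concentrations from lowest to highest and stop at the first
    # one that suppresses growth to the threshold.
    for concentration in sorted(od_by_concentration):
        if od_by_concentration[concentration] <= threshold:
            return concentration
    return None


# Hypothetical OD600 readings for one strain and one agent
# (concentrations in arbitrary units).
example_readings = {0.5: 1.8, 1.0: 1.2, 2.0: 0.15, 4.0: 0.05}
example_mic = estimate_mic(example_readings, od_untreated=2.0)
```

Applying the same function to WT/pSMT3, L154/pSMT3 and L154/pRPS16 readings would give directly comparable MIC values for each agent.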



Reverse transcriptase PCR reaction 

Total RNA from wild-type (WT) M. smegmatis was isolated
using an RNeasy kit (Qiagen). First strand complementary
DNA (cDNA) synthesis was carried out using 1-2-µg
total RNA and random hexamers as primers (Omniscript
kit, Qiagen). A 15-µl aliquot of the cDNA synthesis
reaction was amplified with primers For-RT1 and
Rev-RT1 for the MSMEG_2628 and MSMEG_2629 intergenic
region, For-RT2 and Rev-RT2 for the MSMEG_2629 and
MSMEG_2630 intergenic region and For-RT3 and
Rev-RT3 for the MSMEG_2630 and MSMEG_2631 intergenic
region. The oligonucleotide sequences are shown in
Supplementary Table S1. For each sample, a reverse transcriptase
(RT) negative control was also performed to rule out DNA
contamination. The RT-PCR positive control was carried out
for sigma factor A, sigA.

Enzyme activity assays 

Activity assays on nano-oligonucleotides were performed
using custom-made RNA and DNA 5-mers labeled at their
5′-end with sulfoindocyanine succinimidyl ester cyanine 5
(Cy5) as described previously (5). A 10-µl reaction mix
containing 12.5-µM Cy5-labeled oligos as substrate and
4-µM rMSMEG_2630-A in 50-mM HEPES, pH 7.5, 5-mM
MnCl2, was incubated at 37°C for the indicated time intervals in
different vials. The reaction was stopped by heating at 95°C
for 5 min and the reaction mixture was stored at −20°C.
For analysis of the reaction products, the samples were applied
on a 22% polyacrylamide gel (containing 8-M urea)
and run in 1× Tris/Borate/ethylenediaminetetraacetic acid
(EDTA) electrophoresis buffer. Fluorescent oligos were
visualized using a Typhoon FLA 7000 (GE Healthcare) with
excitation and emission at 635 nm and 670 nm, respectively.
Quantification of the data was done by defining the total
amount of fluorescence measured in the substrate (5-mer)
and the reaction products (4-mer, 3-mer, 2-mer and 1-mer
for ribonucleotides or 4-mer, 3-mer and 2-mer for DNA oligos)
for each time point as 100% and expressing the amount
of each reaction product as a fraction of the total.
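
The normalization described above can be illustrated with a short sketch. It assumes that band fluorescence values have already been extracted from the gel image; the function name and the example intensities are hypothetical.

```python
def product_fractions(band_fluorescence):
    """Express each band as a percentage of the total fluorescence
    (substrate plus all products) measured in one lane/time point.

    band_fluorescence: dict mapping species (e.g. '5-mer') -> signal.
    """
    total = sum(band_fluorescence.values())
    if total <= 0:
        raise ValueError("total fluorescence must be positive")
    return {
        species: 100.0 * signal / total
        for species, signal in band_fluorescence.items()
    }


# Hypothetical band intensities (arbitrary units) for one RNA time point.
example_lane = {
    "5-mer": 600.0,
    "4-mer": 200.0,
    "3-mer": 100.0,
    "2-mer": 75.0,
    "1-mer": 25.0,
}
example_fractions = product_fractions(example_lane)
```

Because each lane is normalized to its own total, differences in loading between time points do not affect the reported fractions.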

Exonuclease activity of rMSMEG_2630-A was also analyzed
with long synthetic radiolabeled oligonucleotides as
substrates. 49-mer DNA oligos (Supplementary Table S1)
5′-end-labeled with [γ-32P]ATP by T4 polynucleotide kinase
or 3′-end-labeled with [α-32P]dATP by terminal deoxynucleotidyl
transferase and a similarly 5′-end-labeled 35-mer
ribonucleotide (Supplementary Table S1) were used as substrates.
Synthetic high performance liquid chromatography
(HPLC)-purified 5′-end-labeled oligonucleotides of lengths
5-mer, 10-mer, 20-mer, 30-mer, 40-mer and 50-mer were
used as markers to analyze the length of degradation products.
The exonuclease activity assay was carried out by incubating
280 nM of 5′-end-labeled ssDNA or RNA with 2.8
µM of purified rMSMEG_2630-A in 20-mM Tris-HCl, pH
7.5, 100-mM NaCl, 10-mM MgCl2 and 1-mM dithiothreitol
(DTT) at 37°C for different time periods. The exonuclease
activity was terminated by addition of 50-mM EDTA
and heating the reaction mix at 95°C for 15 min. The products
were analyzed on a 15% polyacrylamide gel (containing
8-M urea) and visualized on a Phosphor-Imager (Bio-Rad).

The phosphatase activity of rMSMEG_2630-A was estimated
with 3′-phosphoadenosine 5′-phosphate (pAp), sodium
pyrophosphate and sodium triphosphate (Sigma Aldrich)
as substrates. Briefly, 1-mM substrate (in 100-mM Tris-HCl,
pH 8.3, 100-mM KCl and 5-mM MgCl2) was incubated
with 250 nM of rMSMEG_2630-A at 37°C for 2 h. The release
of inorganic phosphate (Pi), due to phosphatase activity
of the protein, was estimated by the malachite green
inorganic phosphate assay and monitored colorimetrically at
610 nm.

Phosphodiesterase activity of rMSMEG_2630-A was
tested using bis-p-nitrophenyl phosphate (bis-pNpp) as a
substrate. The standard reaction included 1-mM substrate,
100-mM Tris-HCl, pH 8.3, 100-mM KCl, 5-mM MgCl2 and
250 nM of purified protein. The reaction was incubated at
37°C for 30 min and the release of p-nitrophenol as product
was monitored colorimetrically at 405 nm. Activities of the
mutants (described below) and N-terminal domain (NTD)
constructs were similarly estimated.

Mutagenesis of conserved residues 

A conserved aspartate residue (D134) and a conserved histidine
residue (H316) in pRPS25 were substituted with
alanine by site-directed mutagenesis (QuikChange II XL,
Stratagene) as per the manufacturer's instructions. The presence
of the desired mutation was confirmed by DNA sequencing.
Mutant plasmids thus obtained, pD134A and
pH316A, were transformed into E. coli BL21(DE3). The
mutant proteins MS-D134A and MS-H316A were expressed
and purified as for the WT rMSMEG_2630-A.

Circular dichroism spectroscopy measurements 

Far-ultraviolet circular dichroism (CD) spectra were col- 
lected on a Jasco J815 CD spectrometer in a quartz cu- 
vette with a path length of 0.1 cm at room temperature 
in the range of 190-250 nm. CD measurements were carried
out in 5-mM Tris-HCl, pH 8.5, 5-mM NaCl, 0.25-mM MgCl2,
0.125-mM DTT and 1.25% glycerol at protein
concentrations of 0.6 mg/ml for MS-D134A and
MS-H316A, 0.8 mg/ml for WT rMSMEG_2630-A, 0.25 
mg/ml for MS-NTD-A, MS-NTD-B and 0.3 mg/ml for 
Ntag_MSMEG_2630. Each spectrum was recorded as an 
average of four scans. In all experiments, contributions of 
the buffer to the spectra were subtracted, and mean residue 
ellipticities were determined before plotting the spectra. 
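
As an illustration of the conversion to mean residue ellipticity, the sketch below uses the commonly applied relation [θ] = θ(mdeg) × MRW / (10 × l × c), with l in cm and c in mg/ml. This is an assumption about the conversion, not a statement of the exact procedure used here; the mean residue weight is taken as M/(N − 1), and the function name and example values are hypothetical.

```python
def mean_residue_ellipticity(theta_mdeg, molar_mass_da, n_residues,
                             conc_mg_per_ml, path_length_cm):
    """Convert observed ellipticity (mdeg) to mean residue ellipticity
    in deg cm^2 dmol^-1.

    Mean residue weight (MRW) is taken as molar mass / (N - 1), i.e. per
    peptide bond; this convention is an assumption of the sketch.
    """
    if n_residues < 2:
        raise ValueError("need at least two residues")
    if conc_mg_per_ml <= 0 or path_length_cm <= 0:
        raise ValueError("concentration and path length must be positive")
    mrw = molar_mass_da / (n_residues - 1)
    return theta_mdeg * mrw / (10.0 * path_length_cm * conc_mg_per_ml)


# Hypothetical reading: -8 mdeg for a 101-residue, 11 kDa protein at
# 0.8 mg/ml in a 0.1-cm cuvette (buffer baseline already subtracted).
example_mre = mean_residue_ellipticity(-8.0, 11000.0, 101, 0.8, 0.1)
```

Normalizing per residue in this way allows spectra recorded at different protein concentrations, such as those listed above, to be overlaid on a common scale.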

Sequence analysis 

Sequence homologs of the previously characterized B. subtilis
NrnA were identified in different bacterial groups as
bidirectional best hits from Basic Local Alignment Search Tool
(BLAST) searches, and the sequences were obtained from NCBI
(http://ncbi.nlm.nih.gov/protein) (10). A phylogenetic tree
was constructed using MEGA5.1 (24) with 1000 bootstrap
replicates for selected sequences. Genetic location and
neighboring genes were identified from GenBank.
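
The bidirectional best hit criterion can be sketched as follows. This is a minimal illustration that assumes BLAST results have already been parsed into (query, subject, bit score) tuples; the function names, sequence identifiers and scores are hypothetical.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query_id, subject_id, bit_score) tuples.
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(forward_hits, reverse_hits):
    """Return sorted (a, b) pairs where b is the best hit of a in the
    forward search and a is the best hit of b in the reverse search."""
    forward = best_hits(forward_hits)
    reverse = best_hits(reverse_hits)
    return sorted(
        (a, b) for a, b in forward.items() if reverse.get(b) == a
    )


# Hypothetical parsed results: B. subtilis NrnA searched against
# M. smegmatis proteins (forward) and the reciprocal search (reverse).
forward_example = [
    ("Bsu_NrnA", "MSMEG_2630", 150.0),
    ("Bsu_NrnA", "MSMEG_0001", 40.0),
]
reverse_example = [
    ("MSMEG_2630", "Bsu_NrnA", 148.0),
    ("MSMEG_0001", "Bsu_other", 90.0),
]
example_pairs = bidirectional_best_hits(forward_example, reverse_example)
```

Only pairs that are each other's top hit in both directions are retained, which is why a weaker forward hit whose own best match lies elsewhere is excluded.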

RESULTS 

Sequence/phylogenetic analysis of NrnA 

A BLAST search with Rv2837c, described earlier
as a nanoRNase in M. tuberculosis (10), identified
MSMEG_2630 as the ortholog in the M. smegmatis mc²155
genome with an E-value of 10⁻¹⁶⁸. MSMEG_2630 consists
of 340 amino acids and, when compared with Rv2837c,
has sequence identity and similarity of 73% and 83%,
respectively. MSMEG_2630 is the best bidirectional BLAST
hit of NrnA in the M. smegmatis genome, confirming it as
the NrnA-ortholog in M. smegmatis. The identity and
similarity of MSMEG_2630 with NrnA of B. subtilis are
25% and 44%, respectively.

A sequence alignment of nanoRNases from various organisms
revealed the presence of several conserved sequence
features among all nanoRNase sequences examined,
suggesting that these residues play important roles in
function (Figure 1). A conserved ¹³⁴DHH¹³⁶ motif in the NTD
and a ³¹³GGGH³¹⁶ motif in the CTD, which are characteristic
features of the DHH/DHHA1 subfamily, are also conserved
in MSMEG_2630 (Figure 1). In addition, actinobacterial
NrnA sequences possess a short stretch of highly variable
residues at their N-terminus not observed in any other
sequences (Figure 1).

An examination of the genomic loci of nrnA and
MSMEG_2630, however, revealed striking differences
in the patterns of neighboring genes of nanoRNase homologs.
Translation initiation factor 2 (infB) (MSMEG_2628) and
Figure 1. Multiple sequence alignment of nanoRNases. Multiple
sequence alignment of selected nanoRNase sequences (Msm: M.
smegmatis MSMEG_2630; Mtu: M. tuberculosis Rv2837c; Cgl: Corynebacterium
glutamicum WP_003861692; Bsu: B. subtilis NrnA (YtqI); Gsp: Geobacillus
sp. YP_001126761; Sma: Streptococcus macedonicus YP_005094369 and
Pmu: Paenibacillus mucilaginosus YP_004644163). The conserved DHH
motif in the NTD and GGGH motif in the C-terminal DHHA1 domain
are shown in rectangular boxes. The mycobacterial sequences contain a
unique stretch of residues at the N-terminus (see the text for details).



ribosome-binding factor A (rbfA) (MSMEG_2629) are
upstream neighboring genes of MSMEG_2630 and appear
to share a transcription start site with MSMEG_2630
(Figure 2a). A gene encoding a multidrug and toxic compound
extrusion protein (MATE)/DNA-damage-inducible
protein F (dinF) (MSMEG_2631), present downstream of
MSMEG_2630, also appears to be transcribed from the
same transcription start site. While MATE is a neighboring
gene in mycobacterial sequences only, nusA, a
transcriptional regulator present further downstream of
MSMEG_2630, is commonly found in this locus in all
the examined sequences. On the other hand, dnaE, acetyl
CoA carboxylase (accA), 6-phosphofructokinase (pfkA)
and pyruvate kinase (pyk) are the frequently observed
neighbors of nrnA of B. subtilis (Figure 2b). A phylogenetic
tree constructed for nanoRNase sequences showed clear
clustering into two major clades, with infB, rbfA and nusA
as neighboring genes in one clade and dnaE, accA, pfkA
and pyk as neighboring genes in the other (Figure 2c).
MSMEG_2630, like Rv2837c, is distinct from B. subtilis
NrnA and is part of a separate clade (Figure 2c). We
propose that members of this separate clade be designated
NrnA′. The nanoRNase activity of MSMEG_2630, a
member of NrnA′, may have specific roles in the regulation
of RNA-mediated transcriptional and/or translation-related
activities and was investigated further during the
course of this work.



MSMEG_2630 is transcribed with infB and rbfA as a single 
operon 

The genetic organization of MSMEG_2630 in M. smegmatis
mc²155 predicts that the MSMEG_2630 locus would be
transcribed as an operon with additional members. To confirm
the transcription of MSMEG_2630 as a single operon
with these members and explore the possible physiological
role of MSMEG_2630, the genic boundaries were identified
by RT-PCR. Amplification with intergenic primer
pairs for each junction between two genes demonstrated
that expression of MSMEG_2630 is transcriptionally linked
to two upstream genes (MSMEG_2628, infB and MSMEG_2629,
rbfA) and one downstream gene (MSMEG_2631, dinF)
(Figure 2d), confirming that these genes are transcribed as
a polycistronic mRNA. This polycistronic message, encoding
MSMEG_2630 (NrnA) along with regulators of translation,
is suggestive of a related function. The MATE efflux family
protein (MSMEG_2631) has been shown to be involved in
stress tolerance in mycobacteria (25). The nanoRNase function
of MSMEG_2630 may hence be linked to stress tolerance in
mycobacteria.

MSMEG_2630 mutant is growth defective

The transposon mutant, L154, with a single mini-mariner
transposon insertion at 441 bp of MSMEG_2630 (this
study) was used to probe the role of MSMEG_2630 in
growth of cells. Disruption of MSMEG_2630 was not lethal
but had a slight effect on growth when the mutant was
grown in LB medium (Figure 3a). At 48 h, there
was a growth difference of ~1 OD600 between WT and
L154. Constitutive expression of MSMEG_2630 from a
multicopy plasmid (pRPS16) under the hsp60 promoter
restored growth of the mutant. This differs
from the ortholog (Rv2837c), which has been previously
reported to be essential for growth of M. tuberculosis H37Rv
(26,27).

Role of MSMEG_2630 in survival under stress conditions 

To investigate the survival of the MSMEG_2630 disruptant
under DNA-damaging conditions, L154 cells were
grown with different concentrations of oxidizing and
DNA-damaging agents such as hydrogen peroxide, EtBr,
menadione and mitomycin C (Figure 3b-e). Expression of
MSMEG_2630 from pRPS16 restored growth of the mutant
in all cases, confirming that the phenotype was due to
the disruption of MSMEG_2630 and not due to a polar effect.
Other agents, namely, ciprofloxacin, nalidixic acid, zeocin,
hydroxyurea, ethyl methanesulfonate and methyl
methanesulfonate, had no effect on growth of the mutant (data not
shown).

L154 cells transformed with pRPS31, carrying the M.
tuberculosis ortholog rv2837c, could also complement the
disruption of MSMEG_2630 when tested for potential DNA
damage with EtBr (Figure 3c).

MSMEG_2630 is a 3'-5' exonuclease 

Figure 2. Synteny conservation and phylogeny of nanoRNases are indicative of diverse families. Gene neighbors of (a) MSMEG_2630 and (b) B. subtilis
NrnA show two distinct groups based on synteny conservation. The location of insertion of the mini-mariner transposon (Tn) at 441 bp of MSMEG_2630 is
indicated with a triangle. The indicated gene sizes are suggestive and not drawn to scale. (c) Phylogenetic tree for selected bacterial nanoRNases (NrnA,
NrnB and NrnC). (d) RT-PCR confirms the presence of the indicated neighbors of MSMEG_2630 in a single operon. A 15-µl aliquot of the cDNA synthesis
reaction was amplified with primers for the MSMEG_2628 and MSMEG_2629 intergenic region (lane 3), the MSMEG_2629 and MSMEG_2630 intergenic
region (lane 4), and the MSMEG_2630 and MSMEG_2631 intergenic region (lane 5). An RT negative control (lane 1) was also performed to rule out DNA
contamination and the RT-PCR positive control was carried out for sigma factor A, sigA (lane 2). Lane M contains a standard DNA ladder (100 bp).

Purified recombinant MSMEG_2630 with a C-terminal
His-tag (rMSMEG_2630-A) was tested for nanoRNase activity.
The nanoRNase activity was tested on different
5′-Cy5-labeled 5-mer RNA oligos, varying in their sequences,
namely, 5′Cy5-rCrArCrCrA, 5′Cy5-rCrCrCrCrC and
5′Cy5-rArArArArA. As shown in Figure 4a, rMSMEG_2630-A
could degrade all three 5-mer nanoRNAs tested. Turnover
rates were measured as disappearance of the initial 5-mer
substrate. As the initial rates of conversion of the 5-mer were
linear for only a short period (less than 1 min), the turnover
rates provide only an approximate rate of conversion. The
turnover rates, however, did not show any significant
sequence preference and were calculated to be in the
range of 32-44 pmol/µg/min for the three different nanoRNAs
(Figure 4b). The turnover rates for disappearance of
the 5-mer RNA are similar to those reported for the M.
tuberculosis homolog Rv2837c (9-28 pmol/µg/min) (10). As with
Rv2837c, the 2-mer was found to be a preferred substrate, as
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Figure 3. Role of MSMEG_2630 in bacterial growth and stress tolerance, 
(a) Growth of mutant L154/pSMT3 (open triangle) is slow as compared to 
wild-type M. smegmatis /pSMT3 (WT) (filled circles). L154/pRPS 16 (filled 
squares) has growth similar to WT. Mutant L154/pSMT3 is sensitive to 
stress from external agents, namely, (b) H2O2, (c) EtBr, (d) menadione and 
(e) mitomycin C. The stress is relieved by complementation and growth is 
restored to that of WT cells in all cases. L154/pSMT3 (open triangles), M. 
smegmatis /pSMT3 (WT) (filled circles) and L154/pRPS16 (filled squares). 
The growth defect in mutant LI 54 by EtBr (C) could also be complemented 
by rv2837c of M. tuberculosis, L154/pRPS31 (open diamonds). 

this intermediate was missing and could not be trapped in 
any of the reactions (Figure 4a). 

The activity of rMSMEG_2630-A was also tested on
DNA oligos (Figure 4c). rMSMEG_2630-A was able to
degrade ssDNA oligos, namely, 5'Cy5-CACCA, 5'Cy5-CCCCC
and 5'Cy5-AAAAA, without any sequence preference but with a
20- to 40-fold slower rate of turnover (calculated between
1.1 and 1.7 pmol/µg/min) in comparison to nanoRNA, when
measured as disappearance of the initial 5-mer substrate
(Figure 4d). In addition, no preference was observed for
shorter ssDNA oligos.
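
The 20- to 40-fold figure can be checked against the two reported ranges. The short Python sketch below is only an illustration: the helper name `fold_range` is hypothetical, and it simply divides the extremes of the nanoRNA range (32-44 pmol/µg/min) by the extremes of the ssDNA range (1.1-1.7 pmol/µg/min) quoted in the text.

```python
def fold_range(fast, slow):
    """Return the (smallest, largest) ratio between two (low, high) rate ranges.

    Both arguments are (low, high) tuples in the same units.
    """
    fast_lo, fast_hi = fast
    slow_lo, slow_hi = slow
    # Smallest ratio: slowest "fast" rate over fastest "slow" rate.
    # Largest ratio: fastest "fast" rate over slowest "slow" rate.
    return fast_lo / slow_hi, fast_hi / slow_lo


# Turnover ranges quoted in the text, in pmol/ug/min.
rna_rates = (32.0, 44.0)   # 5-mer nanoRNA substrates
dna_rates = (1.1, 1.7)     # 5-mer ssDNA substrates

lowest, highest = fold_range(rna_rates, dna_rates)
print(f"{lowest:.1f}- to {highest:.1f}-fold")
```

The ratios come out at roughly 19- to 40-fold, consistent with the rounded range stated above.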

We then probed whether rMSMEG_2630-A could degrade
longer substrates as well. rMSMEG_2630-A exhibited
in vitro exonuclease activity on both a 49-mer ssDNA
(Figure 5a and b) and a 35-mer RNA (Figure 5a) in the
3'-5' direction. No detectable activity could be identified on a
49-mer double-stranded DNA molecule (data not shown),
indicating that rMSMEG_2630-A acts only on single-stranded
oligonucleotides as substrates. Next, we probed the in vitro
phosphodiesterase activity of rMSMEG_2630-A. With bis-pNpp
as a substrate, the specific activity was found to be
16.1 pmol/µg/min (Figure 6a).

rMSMEG_2630-A shows in vitro pAp phosphatase activity 

Purified rMSMEG_2630-A was tested for its CysQ-like
phosphatase activity on pAp. rMSMEG_2630-A could
hydrolyze pAp in vitro, resulting in release of inorganic
phosphate, and the specific activity was calculated to be
20.9 pmol/µg/min (Figure 6b). rMSMEG_2630-A appears to be
highly selective in recognition of this small substrate, as no
in vitro phosphatase activity could be detected with sodium
pyrophosphate or sodium triphosphate under our
experimental conditions.
Figure 4. 3'-5' exonuclease activity of rMSMEG_2630-A on Cy5-labeled
short oligo substrates. rMSMEG_2630-A catalyzed degradation of (a) 
Cy5-labeled 5-mer nanoRNA or (c) Cy5-labeled 5-mer ssDNA (as indi- 
cated). Reaction products were separated on a 22% polyacrylamide gel 
(containing 8-M urea). Cy5 oligos exhibit a reverse migration phenomenon 
(5). This effect is due to cyanine dyes having a lower net negative charge 
than nucleic acids; thus, removing nucleotides reduces the charge relative to 
the mass of the oligonucleotide and causes it to shift up instead of down. 
The different gel mobility as compared with RNA is caused by the one 
missing charge of DNA as DNA monomer cannot enter the gel. M indi- 
cates a size marker obtained by an E. coli Orn catalyzed reaction in all pan- 
els. Background bands observed in some panels were ignored for calcula- 
tion of turnover rates. Quantification of (b) nanoRNA or (d) ssDNA reac- 
tion products and degradation intermediates. Open circles, closed squares, 
closed triangles, crosses and open diamonds indicate 5-mer, 4-mer, 3-mer, 
2-mer and 1-mer, respectively. 




Figure 5. Exonuclease activity of rMSMEG_2630-A on longer substrates.
(a) In vitro exonuclease activity of rMSMEG_2630-A on 49-mer ssDNA
5'-end-labeled with [γ-³²P] ATP was monitored at 0, 5, 10, 15, 30, 45, 60, 90,
120, 240 or 360 min (lanes 2-12) and on 35-mer RNA, 5'-end-labeled with
[γ-³²P] ATP (lanes 13-23). Lane 1 is a control reaction (without enzyme). (b)
The in vitro exonuclease activity of rMSMEG_2630-A on 49-mer ssDNA
3'-end-labeled with [α-³²P] dATP (lanes 2-7), monitored at 0, 15, 30, 45,
60 and 120 min, confirms the 3'-5' direction of degradation by the enzyme.
Lane 1 is a control reaction without rMSMEG_2630-A.



7902 Nucleic Acids Research, 2014, Vol 42, No. 12 







Figure 6. Phosphodiesterase and CysQ-like phosphatase activity of
rMSMEG_2630-A and mutant proteins. (a) Phosphodiesterase activity
(unfilled bars) or (b) CysQ-like pAp phosphatase activity (filled bars) of
rMSMEG_2630-A, MS-D134A and MS-H316A was monitored with
1-mM bis-pNpp or 1-mM pAp as substrates, respectively, and specific
activity was plotted (see the text for more details).



Overall structure 

In order to understand the molecular basis of selective
recognition of nanoRNA by MSMEG_2630 and the architecture
of its constituent domains, the structure of recombinant
tag-free MSMEG_2630 (rMSMEG_2630-B) was determined
to 2.2-Å resolution by the Se-Met SAD method and
refined to a final Rwork of 0.176 and Rfree of 0.225 (Table 1).
There are two molecules in the asymmetric unit forming
a homodimer of identical subunits (Figure 7). The quaternary
structure of rMSMEG_2630-B was also confirmed
by size-exclusion chromatography (Supplementary Figure
S1). The overall structure of each subunit can be divided
into two distinct domains, an NTD comprising residues
22-206 and a shorter CTD comprising residues 224-340. The
two domains are joined by a short linker region consisting
of residues 207-223. The unique N-terminal stretch in
mycobacterial sequences (Figure 1) comprising residues 1-21
of rMSMEG_2630-B was found to be disordered in both
subunits; its presence in the protein crystal was confirmed by
N-terminal sequencing. The final model hence consists of
319 residues in each subunit.
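
The residue bookkeeping above can be verified with a few lines of arithmetic. The following Python sketch is purely illustrative: the helper `span` and the dictionary `REGIONS` are hypothetical names, and the ranges are the inclusive residue boundaries quoted in the text.

```python
def span(first, last):
    """Number of residues in an inclusive residue range."""
    return last - first + 1


# Inclusive residue ranges as quoted in the text.
REGIONS = {
    "disordered N-terminus": (1, 21),
    "NTD (DHH)": (22, 206),
    "linker": (207, 223),
    "CTD (DHHA1)": (224, 340),
}

lengths = {name: span(*limits) for name, limits in REGIONS.items()}

# Residues present in the final model: everything except the disordered stretch.
modeled = sum(n for name, n in lengths.items() if name != "disordered N-terminus")

for name, n in lengths.items():
    print(f"{name}: {n} residues")
print(f"modeled residues per subunit: {modeled}")
```

The NTD, linker and CTD contribute 185, 17 and 117 residues, which sum to the 319 modeled residues per subunit.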



Figure 7. Overall structure. The overall structure of MSMEG_2630 indi- 
cates a homodimer with each subunit consisting of two distinct domains: 
the N-terminal DHH domain and the C-terminal DHHA1 domain. The 
two subunits are shown in different colors. A Mn²⁺ bound at the catalytic
center is shown as a sphere (magenta) while the conserved residues involved
in metal coordination are shown in stick mode. The N- and C-termini are
labeled.

The NTD of rMSMEG_2630-B consists of a twisted
five-stranded parallel β-sheet with α-helices on both sides
(Figure 7 and Supplementary Figure S2A). The topology
resembles the six-stranded parallel β-sheet of the
nucleotide-binding α/β-fold or Rossmann fold, with differences
in the crossover helices, as also confirmed by Dali [for
instance, a Z-score of 10.6 and rmsd of 2.8 Å for 133 Cα atoms
were obtained for a proton-translocating transhydrogenase
(PDB ID: 1DJL)]. The NTD harbors the conserved ¹³⁴DHH¹³⁶
motif and is hence designated the DHH domain. One Mn²⁺
was located in the electron density map, coordinated to
conserved residues in the DHH domain. The CTD consists
of a central mixed β-sheet surrounded by α-helices
on both sides. The CTD harbors conserved residues of
the DHHA1 motif and is designated the DHHA1 domain of
rMSMEG_2630-B. In the rest of this manuscript, the NTD
and CTD are hence referred to as the DHH and DHHA1
domains, respectively. The average B-factor of the DHHA1
domain (49.8 Å² or 41.6 Å² in the two subunits) is higher
than that of the DHH domain (27.9 Å² or 34.6 Å²) in both
subunits, suggestive of larger flexibility in the DHHA1
domain. The DHH or DHHA1 domain in one subunit has a
similar conformation to the corresponding domain in the
other subunit, with low rmsd values of 0.29 Å (185 Cα atoms)
and 0.30 Å (114 Cα atoms), respectively (Supplementary
Figure S2B and C). However, superposition of complete
subunits (rmsd: 2.59 Å for 295 Cα atoms) reveals different
orientations, indicating that the DHH and DHHA1 domains
can move relative to each other due to flexibility around
the linker region, and provides multiple snapshots of the
enzyme (Supplementary Figure S2D).

Active site 

The active site of MSMEG_2630 is located at the domain
interface and defined by an Mn²⁺ coordinated to conserved
residues in the DHH domain. Mn²⁺ is present in an octahedral
geometry coordinated to Asp-51, His-135 and Asp-185
of the protein. In addition, Asp-110 is coordinated to Mn²⁺
in a bidentate fashion along with two water molecules at
distances of 2.33 Å and 2.75 Å to complete the coordination
state (Figure 8a). Although Mn²⁺ is generally present in an
octahedral geometry with a six-coordination state, the
seven-coordination state (with a bidentate coordination with at
least one residue) has been previously seen for Mn²⁺ in
several hydrolases (PDB ID: 1D3V) (28) and phosphatases
(PDB ID: 1G5B) (29) in the PDB.
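
The seven-coordination state follows directly from counting donor atoms. The Python sketch below tallies the ligands named in the text and in the Figure 8 legend; it assumes that Asp-51, His-135 and Asp-185 each contribute a single donor atom (the text states bidentate binding only for Asp-110), and the names `LIGANDS` and `coordination_number` are hypothetical.

```python
# (ligand, number of donor atoms contributed to Mn2+).
# Assumption: residues not described as bidentate donate one atom each.
LIGANDS = [
    ("Asp-51", 1),
    ("His-135", 1),
    ("Asp-185", 1),
    ("Asp-110", 2),  # bidentate carboxylate, as stated in the text
    ("Wat1", 1),
    ("Wat2", 1),
]


def coordination_number(ligands):
    """Sum the donor atoms contributed by each ligand."""
    return sum(donors for _, donors in ligands)


print(coordination_number(LIGANDS))
```

Under that assumption the tally is seven, one more than the six donor atoms of a regular octahedral site, with the extra donor coming from the bidentate Asp-110.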

The catalysis by DHH phosphoesterases including NrnA
(8), RecJ (30) or pyrophosphatases (31-33) follows a
two-metal ion mechanism. However, we could identify only one
Mn²⁺ in each subunit in our apo structure. In the absence
of a bound substrate, the position for the second metal ion
is not occupied, suggesting that the second metal ion is not
structurally required and may bind during catalysis.

The presence of Mn²⁺ in the structure was somewhat
surprising as Mg²⁺ was used in protein purification and
crystallization. In pyrophosphatases as well, an intrinsically
bound Mn²⁺ in the active site is not exchanged despite an
excess of Mg²⁺ in the buffers (32). The coordination
geometries of Mg²⁺ and Mn²⁺ are very similar and their
coordination distances differ by only 0.1 Å; the distances
measured in our structure agree with Mn²⁺ rather than Mg²⁺
coordination distances (34). The metal ion in the crystal
structure was further confirmed as Mn²⁺ through several
other lines of evidence. Firstly, the local B-factors during
refinement were found to be closer to those of the
neighboring residues for a modeled Mn²⁺ (B-factor: 30.1 or
36.1 Å² in the two subunits) than for a modeled Mg²⁺
(B-factor: 10.1 or 12.1 Å²; Supplementary Figure S3).
Secondly, the coordination state of Mn²⁺ in the active site of
family II pyrophosphatases has been suggested to change from
five to six upon substrate binding. Varied coordination states
are possible for transition metals such as Mn²⁺ but not for
Mg²⁺ (33,35). Mn²⁺ is hence conducive to catalysis because it
can alternate between the different coordination states
necessary for catalysis (33). Thirdly, Mn²⁺ has been suggested
to accommodate bidentate coordination by carboxylate groups
(Asp-110 of MSMEG_2630) better than Mg²⁺ due to its
larger ionic radius (32). Lastly, Mn²⁺ is a stronger Lewis
acid than Mg²⁺ and is likely to activate a water molecule
more effectively for a nucleophilic attack during catalysis (32).
Activity assays indicate that although Mn²⁺ was the
preferred ion for catalysis at low concentrations (Figure 8b),
at saturating concentrations both metal ions can substitute
for each other for exonuclease activity in vitro. However,
Ca²⁺, Ni²⁺, Co²⁺, Zn²⁺ and Cu²⁺ could not serve as
cofactors and supported no activity even at saturating
concentrations (Figure 8c).

Role of DHH motif. The DHH motif is conserved in all
members of the DHH superfamily. However, not all of the
conserved residues of the motif are involved in metal
coordination. While His-135 from the conserved ¹³⁴DHH¹³⁶ motif
(or the equivalent histidine of the motif in other structures) is
directly involved in metal coordination, the role of the other
two residues is unclear. Examination of the structure
suggested that Asp-134 and His-136 are structurally important in
arranging the active site for catalysis. The Oδ1 atom of Asp-134
is hydrogen bonded to Nδ1 of His-136 and positions
His-135 for coordination to Mn²⁺. Any change in this region
is likely to perturb the coordination state of Mn²⁺ and
affect activity. A similar arrangement of the conserved
aspartate with respect to the other conserved histidine of the DHH
motif, stabilizing the bimetal catalytic center, has been seen
earlier in pyrophosphatases (31,33) but not in other members
of the DHH superfamily. The role of the conserved aspartate
was probed with a D134A mutant of MSMEG_2630. The
D134A mutation (MS-D134A) led to loss of both exonuclease
(Figure 8d) and pAp phosphatase activity (Figure
6b) in vitro. The CD spectrum of MS-D134A (Supplementary
Figure S4A) is similar to that of WT rMSMEG_2630-A and
confirmed the overall folded state of the protein. Any
potential local perturbations in the MS-D134A structure appear
to be minor, possibly limited to the active site and not
directly discernible in the CD spectra, suggesting that no
other major structural changes took place in MS-D134A.

Structural comparisons 

A search for structural homologs was carried out with the
final coordinates of the rMSMEG_2630-B dimer with Dali (36)
and reveals hits from all the major members of the DHH
superfamily, namely, (i) NrnA (DHH/DHHA1 subfamily),
(ii) RecJ exonucleases (also belonging to the DHH/DHHA1
subfamily) and (iii) metal-dependent inorganic
pyrophosphatases (belonging to the DHH/DHHA2 subfamily). The
closest structural homologs of MSMEG_2630 are B. fragilis
nanoRNase structures (PDB ID: 3W5W, Z-score 29.6,
rmsd: 4.5 Å for 340 Cα atoms and PDB ID: 3DMA,
Z-score 30.1, rmsd: 4.2 Å for 340 Cα atoms). Staphylococcus
haemolyticus exopolyphosphatase (PDB ID: 3DEV, Z-score
28.0, rmsd: 3.2 Å for 316 Cα atoms), an uncharacterized
protein from the Northeast Structural Genomics Consortium,
also shows a high Z-score and appears to belong to the NrnA
family.

The overall structure and arrangement of subunits in
rMSMEG_2630-B are quite similar to the GMP-bound
nanoRNase structure of B. fragilis (PDB ID: 3W5W)
(Figure 9). The superposition of individual subunits as
given by Dali, however, is poor. In addition, a short
stretch of residues at the N-terminus (residues 22-27 in
MSMEG_2630 and 2-11 in B. fragilis nanoRNase) does
not align in the structures, indicating variability in the
N-terminal region of nanoRNases. A closer examination of
individual subunit structures indicates that the CTD in
GMP-bound 3W5W moves upward toward the NTD,
narrowing the cavity at the domain interface (Figure 9a). In
MSMEG_2630, the cavity at the domain interface remains
in a relatively open conformation, resulting in large
deviations when aligning the structures of MSMEG_2630 with
B. fragilis nanoRNase. In the MSMEG_2630 apo structure,
the distance between H316 of the DHHA1 motif
(³¹³GGGH³¹⁶) and Mn²⁺ is 14.6 Å, while in the
ligand-bound 'closed conformation' in 3W5W this distance is
6.6 Å. The individual domains, however, superpose much
better (Figure 9b and c), with rmsd of 1.61 Å (173 Cα
atoms) and 1.62 Å (110 Cα atoms) for the N-terminal DHH
and C-terminal DHHA1 domains, respectively, confirming
that the larger deviations in the overall structures of
rMSMEG_2630-B and B. fragilis nanoRNase are due to
domain closure in the GMP-bound state. Overall similarities
in the individual domains and the presence of conserved
residues in equivalent positions, however, suggest
that NrnA and NrnA' members bind to and degrade
nucleotides through similar mechanisms.

Figure 8. Role of metal ion. (a) Stereo view of Mn²⁺ coordinated at the active site. The residues involved in coordination, namely, Asp51, Asp110, His135
and Asp185, are conserved across all nanoRNase sequences (see Figure 1 and text for details). In addition, two water molecules (Wat1 and Wat2) are
present in the active site to complete the coordination state of Mn²⁺. The manganese ion is shown as a purple sphere while water molecules are shown
as red spheres. A 2Fo-Fc map generated at 1.0 sigma level indicates a good fit. (b) Exonuclease activity of rMSMEG_2630-A as a function of increasing
concentrations of Mn²⁺ (filled bars) or Mg²⁺ (unfilled bars) as indicated. (c) Exonuclease activity of rMSMEG_2630-A on a 3'-³²P-end-labeled ssDNA
(49-mer) in the presence of 10 mM of indicated metal ion (lane 1: Mg²⁺; lane 2: Ca²⁺; lane 3: Mn²⁺; lane 4: Ni²⁺; lane 5: Co²⁺; lane 6: Zn²⁺; lane 7:
Cu²⁺). No extraneous metal ion was added in the control reaction (lane C). (d) An activity assay with MS-D134A (lane 2) or MS-H316A (lane 3) on
3'-³²P-end-labeled ssDNA (49-mer) showed no detectable exonuclease activity. Lane 1 is an assay with rMSMEG_2630-A as a positive control while lane
C is a 'no-protein' negative control.

A Dali search also revealed structural similarities
between MSMEG_2630 and RecJ exonucleases (PDB ID:
1IR6 and 2ZXP) (30,37), both members of the DHH/DHHA1
subfamily. The structures of the N-terminal DHH domain
and the C-terminal DHHA1 domain of both proteins are
very similar to each other and superpose well with rmsd of
1.82 Å (for 185 Cα atoms) and 2.83 Å (for 104 Cα atoms),
respectively (Supplementary Figure S5). However, while all
other top structural homologs consist of biological dimers,
the RecJ family exists as a monomer. An additional OB-fold
domain present in RecJ (30) but absent in nanoRNases confers
binding ability to longer oligonucleotide substrates.

Differential subunit packing in DHH/DHHA1 and DHH/ 
DHHA2 subfamilies 

The third major family showing structural similarities with
MSMEG_2630 (belonging to the DHH/DHHA1 subfamily) is
that of family II pyrophosphatases and polyphosphatases
(belonging to the DHH/DHHA2 subfamily) (Supplementary Table
S2). Structural comparison of MSMEG_2630 with
Streptococcus mutans family II pyrophosphatase (32) provides
interesting insights. Both MSMEG_2630 NrnA' and the
S. mutans pyrophosphatase form homodimers with each
monomer folding into two distinct domains. Despite
variable sequences in the N-terminal DHH and C-terminal
DHHA1 or DHHA2 domains of MSMEG_2630 and S.
mutans pyrophosphatase (PDB ID: 1I74), respectively,
individual subunits (rmsd of 2.92 Å for 309 Cα atoms) and
domains (rmsd of 2.19 Å for 146 Cα atoms of the N-terminal
DHH domain and 2.80 Å for 93 Cα atoms of the C-terminal
DHHA1 versus DHHA2 domain) of these two subfamilies
have similar folds and show overall similar structures
(Figure 10a and Supplementary Figure S6). However, there is a
striking difference in the subunit arrangement in dimers in
these two classes of proteins (Figure 10b and c). The
family II pyrophosphatase homodimer is formed by the two
N-terminal (DHH) domains packing together, making a
hydrophobic dimer interface of two antiparallel β-strands,
one contributed by each monomer. Together, the six
β-strands from each N-terminal DHH domain, hence, form
a single continuous 12-stranded, twisted β-sheet running
through the protein (31). In MSMEG_2630, the terminal
α-helix of the DHH domain (residues 194-205) and the first
α-helix of the DHHA1 domain (residues 227-234), together
with the linker region, form the dimer interface through
interactions with the corresponding regions of the other subunit
and bury 1985 Å² of area at the dimer interface. In contrast
to S. mutans pyrophosphatase, MSMEG_2630 subunits are
oriented perpendicular to each other (Figure 10b and c).
This orientation presents two independent active sites on
opposite faces.

Figure 9. Structural comparison of MSMEG_2630 with B. fragilis nanoRNase. (a) Stereoview of superposition of MSMEG_2630 (subunits in green and
cyan) with nanoRNase of B. fragilis (PDB ID: 3W5W) (red) indicates that the CTD of B. fragilis nanoRNase is slightly shifted toward the catalytic center. A bound
Mn²⁺ (magenta sphere) marks the active site of MSMEG_2630. Superposition of (b) the N-terminal DHH domain and (c) the C-terminal DHHA1 domain
alone of the two proteins shows better alignment, indicating overall similar structures of the individual domains.

The linker region in pyrophosphatases is relatively small
when compared to nanoRNases (Figure 10a). Consequently,
the cavity in rMSMEG_2630-B appears to be
enlarged when compared to pyrophosphatases. Although
rigid body domain movements reveal an 'open' conformation
in the active site of B. subtilis pyrophosphatase (31), the
enlarged cavity between the DHH and DHHA1 domains
of rMSMEG_2630-B may be able to accommodate larger
substrates (oligonucleotides) than pyrophosphatases can,
even in the 'closed' state. Hence, despite overall structural
similarities in the two domains of rMSMEG_2630-B and
pyrophosphatases, rMSMEG_2630-B appears to use its DHH
domains in an alternate subunit arrangement, giving the two
proteins completely different quaternary structures
(Figure 10b and c).



Role of DHH and DHHA1 domains 

In order to investigate the role of the DHH and DHHA1
domains, we made constructs expressing individual
domains. Deletion of the DHHA1 domain in MS-NTD-A
(1-188) and MS-NTD-B (1-211) resulted in loss of in
vitro exonuclease activity on synthetic 49-mer ssDNA
substrates (Figure 11a). CD spectra of MS-NTD-A and
MS-NTD-B (Supplementary Figure S4B) confirmed the
overall folded state of the proteins. However, MS-NTD-B
exhibited both pAp phosphatase and phosphodiesterase
activity in vitro (Figure 11b and c), suggesting that the
DHH domain is capable of catalysis on its own but
requires the DHHA1 domain (212-340) for binding to longer
oligonucleotide substrates. Both the phosphodiesterase
activity of MS-NTD-B (8.9 pmol/µg/min) and its pAp
phosphatase activity (7.7 pmol/µg/min) were similar to those
of full-length Ntag_MSMEG_2630 (9.3 nmol/µg/min and
7.8 pmol/µg/min, respectively). The absence of pAp
phosphatase and phosphodiesterase activity in MS-NTD-A was
somewhat peculiar. Asp-185 of MSMEG_2630 is required
for coordination of Mn²⁺ and maintenance of the active site
(Figure 8a). Asp-185 is present close to the C-terminus of
the MS-NTD-A construct, which consists of 188 residues. We
presume that rearrangements in the C-terminus of MS-NTD-A
affected the coordination state of the bound metal ion,
thereby leading to loss of activity even on small substrates,
unlike MS-NTD-B.

Figure 10. Structural comparison of MSMEG_2630 with Class II pyrophosphatases. (a) Stereoview of superposition of MSMEG_2630 (cyan) with S.
mutans pyrophosphatase (PDB ID: 1I74) (orange). The longer linker region in MSMEG_2630 enlarging the cavity at the domain interface to enable
binding of larger oligonucleotide substrates is indicated. Cartoon representation of subunit arrangement in dimers of (b) MSMEG_2630 and (c) family II
pyrophosphatases.

Figure 11. Role of the N-terminal DHH domain of MSMEG_2630.
(a) Exonuclease activity was monitored on ssDNA (lanes 1-4) or
ssRNA (lanes 5-8). Deletion of the CTD leads to loss of exonuclease
activity on 3'-³²P ssDNA or RNA in MS-NTD-A (lane 3 and lane 7) as
well as MS-NTD-B (lane 4 and lane 8). Lanes 1 and 5: control
reactions without MSMEG_2630; lanes 2 and 6: Ntag_MSMEG_2630. (b)
Phosphodiesterase activity (unfilled bars) or (c) CysQ-like pAp
phosphatase activity (filled bars) of MS-NTD-A and MS-NTD-B was
monitored with 1-mM bis-pNpp or 1-mM pAp as substrates, respectively,
and specific activity was plotted (see the text for more details). WT is
the reaction with Ntag_MSMEG_2630.



The C-terminal DHHA1 domain of MSMEG_2630 is devoid
of known active site residues. In order to explore the role
of the DHHA1 domain, a Dali search with the C-terminal
DHHA1 domain alone was carried out. The Dali search shows
top hits from CTDs of DHH subfamilies (NrnA,
pyrophosphatases and RecJ). Structural similarity was also observed
with the C-ala domain of alanyl-tRNA synthetase involved
in single-stranded nucleotide binding (PDB ID: 3G98,
Z-score: 8.8, rmsd: 2.8 Å for 111 Cα atoms) (38). The
overall structural similarity of the DHHA1 domain with the
C-ala domain suggests its role in binding to longer
single-stranded oligonucleotide substrates, as suggested earlier for
both RecJ (38) and NrnA exonuclease (8). However, no
detectable binding to ssDNA or RNA could be identified with
the DHHA1 domain (MS-CTD) under our experimental
conditions (data not shown).

The role of the DHHA1 domain and the residues therein
was hence investigated by mutagenesis of H316 in the
conserved ³¹³GGGH³¹⁶ motif of the full-length protein. An
activity assay with MS-H316A showed no phosphatase activity
on pAp (Figure 6b) or exonuclease activity (Figure 8d). In
addition, the specific activity with bis-pNpp as a substrate
was found to be 10.1 pmol/µg/min, an ~40% loss in the in
vitro phosphodiesterase activity (Figure 6a). A structurally
equivalent His311 of B. fragilis nanoRNase has been
proposed earlier to be a likely residue in substrate recognition
(8). A GMP ligand binds close to the conserved His311 of
the B. fragilis nanoRNase structure, although no direct
interactions were observed between GMP and residues in the motif
(8). Mutation of the equivalent conserved histidine resulted in
decreased exonuclease activity in E. coli RecJ as well (39).
The complete loss of exonuclease activity in MS-H316A,
in contrast to E. coli RecJ, could be due to the additional
OB-fold domain available in RecJ for oligonucleotide
substrate binding. Like MS-NTD-B, MS-H316A appears to
retain phosphodiesterase activity on bis-pNpp as it contains
all the catalytic residues. We could not, however, explain the
loss of pAp phosphatase activity of MS-H316A.
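
The ~40% figure can be reproduced from the two specific activities on bis-pNpp reported in this section (16.1 pmol/µg/min for rMSMEG_2630-A and 10.1 pmol/µg/min for MS-H316A). The Python sketch below is an illustrative calculation only; the function name `percent_loss` is hypothetical.

```python
def percent_loss(wild_type, mutant):
    """Percentage of wild-type specific activity lost in the mutant."""
    return 100.0 * (wild_type - mutant) / wild_type


# Specific activities on bis-pNpp quoted in the text, in pmol/ug/min.
wt_activity = 16.1       # rMSMEG_2630-A
h316a_activity = 10.1    # MS-H316A

loss = percent_loss(wt_activity, h316a_activity)
print(f"loss: {loss:.1f}%, retained: {100.0 - loss:.1f}%")
```

The calculation gives a loss of about 37%, which the text rounds to ~40%.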

DISCUSSION 

Members of the DHH superfamily include 5'-3' exonucleases
[RecJ (30,37)], nanoRNases [NrnA, (5,6,8,10)],
cyclic nucleotide phosphodiesterases [YybT in B. subtilis,
(40)] and eukaryotic prune proteins (41), or
phosphatases [exopolyphosphatase, family II pyrophosphatase
(31-32,35,42)]. Despite the versatile nature of their
specific substrates, all DHH superfamily members play
significant physiological roles and are important in
maintenance of homeostasis. The DHH superfamily in bacteria can
be divided into three distinct subfamilies on the basis of
specific substrate recognition: (i) RecJ or RecJ-like 5'-3'
exonucleases (DHH/DHHA1 subfamily), (ii)
manganese-dependent family II pyrophosphatases (DHH/DHHA2
subfamily) and (iii) the recently discovered nanoRNases
(DHH/DHHA1 subfamily).

Multiple nanoRNases of the DHH/DHHA1 subfamily
[NrnA and NrnB in B. subtilis (5,6), Mpn140 in M.
pneumoniae, Rv2837c in M. tuberculosis (10), nanoRNase of
B. fragilis (8)] and the DEDD subfamily (NrnC in Bartonella
henselae) (7) have been identified. However, the
phylogenetic origin of these multiple nanoRNase families in
prokaryotes is unclear. On the basis of its gene location
and synteny conservation, we show for the first time that
the DHH/DHHA1 subfamily of NrnA may have another level
of functional divergence and it may have evolved to carry
out distinct in vivo functions.

NanoRNases of different bacteria are present in two distinct 
families 

Rv2837c is the only bifunctional enzyme with both
nanoRNase and pAp phosphatase activity identified in M.
tuberculosis so far (10). We have characterized MSMEG_2630
of M. smegmatis and attributed similar biochemical
properties to this enzyme. However, mycobacteria and a few
other actinobacteria are unique in encoding a nanoRNase,
an ORN and CysQ in their genomes (5,43-44). Differing
substrate specificities in multiple nanoRNA-degrading
enzymes in bacteria have been suggested earlier (1). In order to
seek greater insights into the physiological role of different
nanoRNA-processing enzymes in mycobacteria (or
Actinomycetes), we revisited the locus of NrnA in different
organisms for clues to its possible physiological function.
Phylogenetic analysis and examination of neighboring genes
revealed clustering of sequences into two distinct groups:
operons containing infB and rbfA genes and those lacking
these. In addition to infB and rbfA, MATE/dinF, encoding
a MATE efflux pump, is present on the same operon in
both M. tuberculosis and M. smegmatis genomes. The MATE
efflux pump is involved in the efflux of toxic compounds
(25). The rescue of the sensitive phenotype of M. smegmatis
ΔMSMEG_2630 to toxic compounds by complementation
with MSMEG_2630 of M. smegmatis or rv2837c of M.
tuberculosis (in the case of EtBr) suggests all these genes may be
involved in related functions of DNA damage tolerance and
repair. The redundant presence of ORN and nanoRNase
only among Actinobacteria leads to a somewhat provocative
assumption that while the role of ORN in mycobacteria
may be in generalized nanoRNA degradation, NrnA'
may be specialized for DNA repair-related
activities. In a recent study, the possible coevolution of
nanoRNases and their association with the RNA degradation
machinery of the bacterial cell has been suggested
(45). The coevolution with the RNA degradosome and synteny
conservation of MSMEG_2630 and Rv2837c with infB and
rbfA hence suggest that mycobacterial nanoRNases may
have a dual role, in DNA repair-related activities as well as
in transcription/translation fidelity in the cell.

New packaging of an old fold for mechanistic conservation 

Despite a large number of structures in the PDB, the total
number of folds is limited (46). Proteins use these limited folds
in different permutations to achieve a functional and
mechanistic convergence for related activities. Functional
convergence in several enzymes through conservation of
active site residues despite completely different structures is a
common feature in bacterial enzymes. For instance, despite
completely different sequence and structure, family I and
family II pyrophosphatases remain mechanistically analogous
to each other through conservation of their active sites
(32). In another example of convergent evolution that
conserves mechanism, the trypsin and subtilisin families
use the same catalytic triad for protein hydrolysis (47).
Similarly, despite completely different three-dimensional
structures, the active site and catalytic mechanism of
topoisomerase and site-specific recombinases are similar (48).

NanoRNases achieve a functional conservation and mul- 
tiple activities in a unique way. The three-dimensional struc- 
tures of either the NTD or CTD of NrnA, RecJ and family 
II pyrophosphatases are very similar. The loss of exonucle- 
ase activity on DNA while retaining phosphodiesterase as 
well as phosphatase activity by the DHH domain (residues 
1-211) of MSMEG_2630 suggests that the DHH domain 
confers activity while the DHHA1 domain confers substrate
recognition in these proteins. However, completely
different packing of the subunits in the three proteins
results in recognition of different substrates in a highly
specific manner. This is achieved through several minor
alterations: (i) alternate packing of subunits via additional
interactions in the N-terminal DHH domain, C-terminal DHHA1
domain and linker region of MSMEG_2630, as against the
β-sheet extension in the N-terminal DHH domain of
pyrophosphatases, giving rise to completely different
quaternary structures; (ii) extension of the linker between
the DHH and DHHA1 domains of MSMEG_2630 to enlarge the
substrate-binding cavity for binding of oligonucleotides;
(iii) insertion of an additional OB-fold between the DHH
and DHHA1 domains of RecJ to enlarge the domain interface,
forcing the protein to remain a monomer. Intriguingly,
apart from the above modifications, all these proteins har- 
bor structurally similar N-terminal DHH or C-terminal 
DHHA1/DHHA2 domains. Hence, through combinations
of previously existing DHH and DHHA1 domains,
nanoRNases achieve a new quaternary arrangement that enables
them to perform multiple functions in a mechanistically
similar manner. This mode of conservation of mechanism is
also evolutionarily advantageous to the cell, as it avoids
encoding multiple folds in its genome.



CONCLUSIONS 

Permutations of a few available structural and sequence
motifs are utilized by bacteria to achieve multiple,
functionally distinct activities. The structure of the
MSMEG_2630 nanoRNase reveals a new mode of oligomerization
of exopolyphosphatase-like subunits to achieve a similar
metal-dependent activity on longer oligonucleotide
substrates.

ACCESSION NUMBERS 

The coordinates of MSMEG_2630 have been deposited 
with PDB with the accession code 4LS9. Accession codes 
of other PDB IDs analyzed: 1DJL, 1D3V, 1G5B, 3W5W, 
3DMA, 3DEV, 1IR6, 2ZXP, 1I74 and 3G98.